Power laws, Pareto distributions and Zipf's law
When the probability of measuring a particular value of some quantity varies
inversely as a power of that value, the quantity is said to follow a power law,
also known variously as Zipf's law or the Pareto distribution. Power laws
appear widely in physics, biology, earth and planetary sciences, economics and
finance, computer science, demography and the social sciences. For instance,
the distributions of the sizes of cities, earthquakes, solar flares, moon
craters, wars and people's personal fortunes all appear to follow power laws.
The origin of power-law behaviour has been a topic of debate in the scientific
community for more than a century. Here we review some of the empirical
evidence for the existence of power-law forms and the theories proposed to
explain them. Comment: 28 pages, 16 figures, minor corrections and additions in this version
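As a rough illustration of the definition above, the following sketch draws samples from a continuous power law p(x) proportional to x^(-alpha) for x >= x_min and recovers the exponent with the standard maximum-likelihood estimator alpha = 1 + n / sum(ln(x_i / x_min)). The function names, the seed, and the parameter values are assumptions chosen for this example and do not come from the abstract.

```python
import math
import random

def sample_power_law(alpha, x_min, n, rng):
    """Draw n samples from p(x) proportional to x^(-alpha), x >= x_min,
    using inverse-transform sampling (requires alpha > 1)."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]

def estimate_alpha(samples, x_min):
    """Maximum-likelihood estimate of the exponent for a continuous power law:
    alpha = 1 + n / sum(ln(x_i / x_min)), using only samples with x >= x_min."""
    tail = [x for x in samples if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)

# Hypothetical parameters, chosen only for illustration.
rng = random.Random(0)
data = sample_power_law(alpha=2.5, x_min=1.0, n=50_000, rng=rng)
alpha_hat = estimate_alpha(data, x_min=1.0)
print(f"estimated exponent: {alpha_hat:.3f}")
```

With tens of thousands of samples the estimate should land close to the true exponent; on real data the choice of x_min matters and would need its own treatment.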
Handling oversampling in dynamic networks using link prediction
Oversampling is a common characteristic of data representing dynamic
networks. It introduces noise into representations of dynamic networks, but
there has been little work so far to compensate for it. Oversampling can affect
the quality of many important algorithmic problems on dynamic networks,
including link prediction. Link prediction seeks to predict edges that will be
added to the network given previous snapshots. We show that not only does
oversampling affect the quality of link prediction, but that we can use link
prediction to recover from the effects of oversampling. We also introduce a
novel generative model of noise in dynamic networks that represents
oversampling. We demonstrate the results of our approach on both synthetic and
real-world data. Comment: ECML/PKDD 201
Evolutionary dynamics of the cryptocurrency market
The cryptocurrency market surpassed the barrier of $100 billion market capitalization in June 2017, after months of steady growth. Despite its increasing relevance in the financial world, a comprehensive analysis of the whole system is still lacking, as most studies have focused exclusively on the behaviour of one (Bitcoin) or few cryptocurrencies. Here, we consider the history of the entire market and analyse the behaviour of 1469 cryptocurrencies introduced between April 2013 and May 2017. We reveal that, while new cryptocurrencies appear and disappear continuously and their market capitalization is increasing (super-)exponentially, several statistical properties of the market have been stable for years. These include the number of active cryptocurrencies, market share distribution and the turnover of cryptocurrencies. Adopting an ecological perspective, we show that the so-called neutral model of evolution is able to reproduce a number of key empirical observations, despite its simplicity and the assumption of no selective advantage of one cryptocurrency over another. Our results shed light on the properties of the cryptocurrency
market and establish a first formal link between ecological modelling and the study of this growing system. We anticipate they will spark further research in this direction
Measuring the evolution of contemporary western popular music
Popular music is a key cultural expression that has captured listeners'
attention for ages. Many of the structural regularities underlying musical
discourse are yet to be discovered and, accordingly, their historical evolution
remains formally unknown. Here we unveil a number of patterns and metrics
characterizing the generic usage of primary musical facets such as pitch,
timbre, and loudness in contemporary western popular music. Many of these
patterns and metrics have been consistently stable for a period of more than
fifty years, thus pointing towards a great degree of conventionalism.
Nonetheless, we prove important changes or trends related to the restriction of
pitch transitions, the homogenization of the timbral palette, and the growing
loudness levels. This suggests that our perception of the new would be rooted
on these changing characteristics. Hence, an old tune could perfectly sound
novel and fashionable, provided that it consisted of common harmonic
progressions, changed the instrumentation, and increased the average loudness. Comment: Supplementary materials not included. Please see the journal
reference or contact the author
Error and attack tolerance of complex networks
Many complex systems, such as communication networks, display a surprising
degree of robustness: while key components regularly malfunction, local
failures rarely lead to the loss of the global information-carrying ability of
the network. The stability of these complex systems is often attributed to the
redundant wiring of the functional web defined by the systems' components. In
this paper we demonstrate that error tolerance is not shared by all redundant
systems, but it is displayed only by a class of inhomogeneously wired networks,
called scale-free networks. We find that scale-free networks, describing a
number of systems, such as the World Wide Web, Internet, social networks or a
cell, display an unexpected degree of robustness, the ability of their nodes to
communicate being unaffected by even unrealistically high failure rates.
However, error tolerance comes at a high price: these networks are extremely
vulnerable to attacks, i.e. to the selection and removal of a few nodes that
play the most important role in assuring the network's connectivity. Comment: 14 pages, 4 figures, Late
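The contrast between random failure and targeted attack can be seen on a toy example. The sketch below is not the paper's experiment: it assumes a hypothetical hub-and-spoke network (one hub, ten leaves) and compares the size of the largest connected component after removing one arbitrary leaf versus removing the most-connected node. The helper name `largest_component` is made up for this illustration.

```python
from collections import deque

def largest_component(adj, removed):
    """Size of the largest connected component after deleting `removed` nodes.
    `adj` maps each node to the set of its neighbours."""
    seen = set(removed)          # removed nodes are never visited
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:             # breadth-first search over surviving nodes
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best

# Toy inhomogeneous network (an assumption): node 0 is the hub, 1..10 are leaves.
adj = {0: set(range(1, 11))}
for leaf in range(1, 11):
    adj[leaf] = {0}

# "Error": one low-degree node fails. "Attack": the highest-degree node is removed.
after_failure = largest_component(adj, removed={1})
hub = max(adj, key=lambda v: len(adj[v]))
after_attack = largest_component(adj, removed={hub})
print(after_failure, after_attack)
```

Losing a leaf leaves the rest of the network connected, while losing the hub isolates every remaining node, which is the qualitative asymmetry the abstract describes.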
Popularity versus Similarity in Growing Networks
Popularity is attractive -- this is the formula underlying preferential
attachment, a popular explanation for the emergence of scaling in growing
networks. If new connections are made preferentially to more popular nodes,
then the resulting distribution of the number of connections that nodes have
follows power laws observed in many real networks. Preferential attachment has
been directly validated for some real networks, including the Internet.
Preferential attachment can also be a consequence of different underlying
processes based on node fitness, ranking, optimization, random walks, or
duplication. Here we show that popularity is just one dimension of
attractiveness. Another dimension is similarity. We develop a framework where
new connections, instead of preferring popular nodes, optimize certain
trade-offs between popularity and similarity. The framework admits a geometric
interpretation, in which popularity preference emerges from local optimization.
As opposed to preferential attachment, the optimization framework accurately
describes large-scale evolution of technological (Internet), social (web of
trust), and biological (E.coli metabolic) networks, predicting the probability
of new links in them with a remarkable precision. The developed framework can
thus be used for predicting new links in evolving networks, and provides a
different perspective on preferential attachment as an emergent phenomenon
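For orientation, here is a minimal sketch of plain preferential attachment, the baseline mechanism this abstract contrasts with; it does not implement the popularity-similarity optimization framework itself. The function name `grow_preferential`, the initial clique, and the parameters (2000 nodes, 2 links per new node) are assumptions made for the example.

```python
import random
from collections import Counter

def grow_preferential(n_nodes, m, seed=0):
    """Grow a network in which each new node links to m distinct existing
    nodes chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Start from a small clique of m + 1 nodes so every node has degree >= m.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node appears in `stubs` once per incident edge, so a uniform draw
    # from `stubs` selects a node with probability proportional to its degree.
    stubs = [v for e in edges for v in e]
    for new in range(m + 1, n_nodes):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in sorted(targets):
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

edges = grow_preferential(n_nodes=2000, m=2)
degree = Counter(v for e in edges for v in e)
print("max degree:", max(degree.values()), "min degree:", min(degree.values()))
```

Most nodes keep a degree close to m while a few early nodes accumulate many links, which is the heavy-tailed degree pattern the abstract attributes to popularity-driven growth.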
SCI
Persistent homology is a powerful tool in Topological Data Analysis (TDA) to capture the topological properties of data succinctly at different spatial resolutions. For graphical data, the shape and structure of the neighborhood of individual data items (nodes) are an essential means of characterizing their properties. We propose the use of persistent homology methods to capture structural and topological properties of graphs and use them to address the problem of link prediction. We achieve encouraging results on nine different real-world datasets that attest to the potential of persistent homology-based methods for network analysis
Computational fact checking from knowledge networks
Traditional fact checking by expert journalists cannot keep up with the
enormous volume of information that is now generated online. Computational fact
checking may significantly enhance our ability to evaluate the veracity of
dubious information. Here we show that the complexities of human fact checking
can be approximated quite well by finding the shortest path between concept
nodes under properly defined semantic proximity metrics on knowledge graphs.
Framed as a network problem this approach is feasible with efficient
computational techniques. We evaluate this approach by examining tens of
thousands of claims related to history, entertainment, geography, and
biographical information using a public knowledge graph extracted from
Wikipedia. Statements independently known to be true consistently receive
higher support via our method than do false ones. These findings represent a
significant step toward scalable computational fact-checking methods that may
one day mitigate the spread of harmful misinformation
Network 'small-world-ness': a quantitative method for determining canonical network equivalence
Background: Many technological, biological, social, and information networks fall into the broad class of 'small-world' networks: they have tightly interconnected clusters of nodes, and a shortest mean path length that is similar to a matched random graph (same number of nodes and edges). This semi-quantitative definition leads to a categorical distinction ('small/not-small') rather than a quantitative, continuous grading of networks, and can lead to uncertainty about a network's small-world status. Moreover, systems described by small-world networks are often studied using an equivalent canonical network model, the Watts-Strogatz (WS) model. However, the process of establishing an equivalent WS model is imprecise and there is a pressing need to discover ways in which this equivalence may be quantified.
Methodology/Principal Findings: We defined a precise measure of 'small-world-ness' S based on the trade-off between high local clustering and short path length. A network is now deemed a 'small-world' if S > 1, an assertion which may be tested statistically. We then examined the behavior of S on a large data-set of real-world systems. We found that all these systems were linked by a linear relationship between their S values and the network size n. Moreover, we show a method for assigning a unique Watts-Strogatz (WS) model to any real-world network, and show analytically that the WS models associated with our sample of networks also show linearity between S and n. Linearity between S and n is not, however, inevitable, and neither is S maximal for an arbitrary network of given size. Linearity may, however, be explained by a common limiting growth process.
Conclusions/Significance: We have shown how the notion of a small-world network may be quantified. Several key properties of the metric are described and the use of WS canonical models is placed on a more secure footing
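The abstract does not spell out the formula for S. A minimal sketch, assuming the commonly cited form in which clustering and mean path length are each normalised by their values in a matched random graph and then divided: S = (C / C_rand) / (L / L_rand). The function name and the numerical inputs below are hypothetical.

```python
def small_world_ness(c, l, c_rand, l_rand):
    """Small-world-ness S = (C / C_rand) / (L / L_rand).

    c, l           : clustering coefficient and mean shortest path length
                     of the network under study
    c_rand, l_rand : the same quantities for a matched random graph
                     (same number of nodes and edges)
    """
    gamma = c / c_rand   # how much more clustered than random
    lam = l / l_rand     # how much longer the paths are than random
    return gamma / lam

# Hypothetical measurements, not taken from the paper's data-set.
s = small_world_ness(c=0.5, l=3.0, c_rand=0.05, l_rand=2.5)
print(f"S = {s:.2f}; small-world: {s > 1}")
```

Under this form, a network that is much more clustered than random while keeping paths nearly as short yields S well above 1.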
Comparison of contact patterns relevant for transmission of respiratory pathogens in Thailand and the Netherlands using respondent-driven sampling
Understanding infection dynamics of respiratory diseases requires the identification and quantification of behavioural, social and environmental factors that permit the transmission of these infections between humans. Little empirical information is available about contact patterns within real-world social networks, let alone on differences in these contact networks between populations that differ considerably on a socio-cultural level. Here we compared contact network data that were collected in the Netherlands and Thailand using a similar online respondent-driven method. By asking participants to recruit contact persons we studied network links relevant for the transmission of respiratory infections. We studied correlations between recruiter and recruited contacts to investigate mixing patterns in the observed social network components. In both countries, mixing patterns were assortative by demographic variables and random by total numbers of contacts. However, in Thailand participants reported overall more contacts, which resulted in higher effective contact rates. Our findings provide new insights on numbers of contacts and mixing patterns in two different populations. These data could be used to improve parameterisation of mathematical models used to design control strategies. Although the spread of infections through populations depends on more factors, the similarities found suggest that spread may be similar in the Netherlands and Thailand